dummy data

1 Introduction

For the preparation of Osteopathy research, dummy data is prepared and used. The objective of the this mock-up report is to show how data of Osteopathy research can be analyzed and visualized using R-script in combination with Markdown language.
The goal of the research is to measure if Osteopathy treatment in addition to physiotherapy increase the healing process of post-Covid19 patients suffering from post covid syndrome.


2 Setup

Load environment, libraries and functions.

rm(list = ls())  # remove all data from global environment.
# Set working directory
setwd("~/Documents/01_Workspace/01_Rproject/osteo")
#Load libraries 
library(tidyverse)
#library(lubridate)
library(scales)
#library(car)
#library(SpATS)
library(plotly)
#library(expss)
library(RColorBrewer)
#library(formattable)
# library(sparkline)
# library(kableExtra)
# library(heatmaply)
# library(reshape)
# library(matrixStats)
#library(RCurl)
library(DT)
#library(ggrepel)
#library(qwraps2)
#library(gge)
#library(GGEBiplots)
#library(matrixStats)



# FUNCTIONS

# Create Data table. -----
create_dt <- function(x){
  DT::datatable(x, class = 'cell-border stripe', filter = 'top',
                extensions = 'Buttons',
                options = list(dom = 'Blfrtip',
                               buttons = c('copy', 'csv', 'excel', 'pdf', 'print'),
                               lengthMenu = list(c(10,25,50,-1),
                                                 c(10,25,50,"All"))))
}

3 Study Population

3.1 Subjects

Each treatment group contain ten participants. The participants responded to a questionnaire (as described in vragenlijst_copd_ccq.pdf) before and three months after the start of treatment.

# Load tables


osteo_data <- read_delim("01_source/osteo_dummy_data.csv", 
    ";", escape_double = FALSE, trim_ws = TRUE)

osteo_data$gender <- as.factor(osteo_data$gender)
osteo_data$Treatment <- as.factor(osteo_data$Treatment)
osteo_data$Time <- as.factor(osteo_data$Time)
before <- osteo_data %>% 
  filter(Time == 'before') %>%
  select(Subject:Treatment) 

t1 <- create_dt(before) 

t1

3.2 Age

Perform T-test to check if age in study groups are the similar
H0: The mean gae in the two study groups are not significantly different.
H1: Mean age between the two study groups are significantly different.

3.2.1 Shapiro test

The null-hypothesis of this test is that the population is normally distributed. Thus, on one hand, if the p value is less than the chosen alpha level, then the null hypothesis is rejected and there is evidence that the ages are not normally distributed. On the other hand, if the p value is greater than the chosen alpha level, then the null hypothesis (that the ages came from a normally distributed population) can not be rejected (e.g., for an alpha level of .05, a data set with a p value of less than .05 rejects the null hypothesis that the ages are from a normally distributed population).

conventional <- before %>% 
  filter(Treatment == 'conventional') 

shapiro.test(conventional$age)
## 
##  Shapiro-Wilk normality test
## 
## data:  conventional$age
## W = 0.84929, p-value = 0.05697
additional <- before %>% 
  filter(Treatment == 'additional')

shapiro.test(additional$age)
## 
##  Shapiro-Wilk normality test
## 
## data:  additional$age
## W = 0.98609, p-value = 0.9894
rm(additional, conventional, t1)

The Shapiro-test provides a p-value 0.057 for the age distribution among participants receiving the conventional treatment. Ages are normally distributed.

The Shapiro-test provides a p-value 0.99 for the age distribution among participants receiving the additional treatment. Ages are normally distributed.

We can contiue with the Student T-test for the ages in both groups.

library(ggpubr)


p1 <- ggboxplot(before, x = "Treatment", y = "age",
          color = "Treatment", palette = "jco",
          add = "jitter",
          short.panel.labs = FALSE) + 
  stat_compare_means(label = "p.format", label.y.npc = "top", label.x.npc = "middle") + 
  theme_grey(base_size = 12)

p1

H0 is not rejected: With a p-value of 0.47 the mean of ages in the two treatment groups is not significantly lower of higher.

3.3 Gender

Perform Pearson’s Chi-square test to check if gender is associated with the treatment group.

Pearson’s Chi-square test: \(\chi^{2} = \sum \frac{(observed_{ij} - model_{ij})^{2}}{model_{ij}}\)

H0: The gender is not associated with the treatment group.
H1: The gender is disproportionately different between the treatment groups. .

library(gmodels)

chi <- CrossTable(before$gender, before$Treatment, fisher = TRUE, chisq = TRUE, expected = TRUE,
           prop.c = FALSE, prop.t = FALSE, prop.chisq = FALSE, sresid = TRUE, format = "SPSS")
## 
##    Cell Contents
## |-------------------------|
## |                   Count |
## |         Expected Values |
## |             Row Percent |
## |            Std Residual |
## |-------------------------|
## 
## Total Observations in Table:  20 
## 
##               | before$Treatment 
## before$gender |   additional  | conventional  |    Row Total | 
## --------------|--------------|--------------|--------------|
##        female |           3  |           5  |           8  | 
##               |       4.000  |       4.000  |              | 
##               |      37.500% |      62.500% |      40.000% | 
##               |      -0.500  |       0.500  |              | 
## --------------|--------------|--------------|--------------|
##          male |           7  |           5  |          12  | 
##               |       6.000  |       6.000  |              | 
##               |      58.333% |      41.667% |      60.000% | 
##               |       0.408  |      -0.408  |              | 
## --------------|--------------|--------------|--------------|
##  Column Total |          10  |          10  |          20  | 
## --------------|--------------|--------------|--------------|
## 
##  
## Statistics for All Table Factors
## 
## 
## Pearson's Chi-squared test 
## ------------------------------------------------------------
## Chi^2 =  0.8333333     d.f. =  1     p =  0.3613104 
## 
## Pearson's Chi-squared test with Yates' continuity correction 
## ------------------------------------------------------------
## Chi^2 =  0.2083333     d.f. =  1     p =  0.6480769 
## 
##  
## Fisher's Exact Test for Count Data
## ------------------------------------------------------------
## Sample estimate odds ratio:  0.4476084 
## 
## Alternative hypothesis: true odds ratio is not equal to 1
## p =  0.6499166 
## 95% confidence interval:  0.04558065 3.669353 
## 
## Alternative hypothesis: true odds ratio is less than 1
## p =  0.3249583 
## 95% confidence interval:  0 2.786368 
## 
## Alternative hypothesis: true odds ratio is greater than 1
## p =  0.9150988 
## 95% confidence interval:  0.06312785 Inf 
## 
## 
##  
##        Minimum expected frequency: 4 
## Cells with Expected Frequency < 5: 2 of 4 (50%)

Because the numbers in several groups is lower than 5, the Fisher test is applied.

The value of \(\chi^{2}\) was 0.21,
degree of freedom is one,

The H0 is not rejected.

There was no significant association between the type of treatment and the gender of the participant.

4 Results

Questions 1, 2, 3 and 4 are related to respiration.
Questions 5 and 6 are related to coughing.
Questions 7, 8, 9 and 10 are related to activities.

For each of these three groups the mean value is calculated and used for comparing before and after treatment.
First separate for the two treatment groups.

osteo_data <- osteo_data %>% 
  rowwise() %>%
  mutate(Q1_4 = mean(c(Q1, Q2, Q3, Q4), na.rm = TRUE)) %>% 
  mutate(Q5_6 = mean(c(Q5, Q6), na.rm = TRUE)) %>% 
  mutate(Q7_Q10 = mean(c(Q7, Q8, Q9, Q10), na.rm = TRUE))


resp <- osteo_data %>% 
  select(Subject, gender, age, Treatment, Question, Time, Q1_4)

resp <- pivot_wider(resp, names_from = Time, values_from = Q1_4)

# resp_before <- subset(osteo_data, Treatment == 'conventional' & Time == 'before', Q1_4, drop = TRUE) 
# resp_after <- subset(osteo_data, Treatment == 'conventional' & Time == 'after', Q1_4, drop = TRUE) 

# Plot paired data


resp$Treatment <- factor(resp$Treatment, levels=c("conventional", "additional"))


p2 <- ggpaired(resp, cond1 = "before", cond2 = "after", palette = "jco",
               fill = "condition",
               title = "Respiration levels before and after treatment",
               ylab = "Mean questions 1-4",
               xlab = "Treatment", 
               facet.by = "Treatment") + 
   stat_compare_means(comparisons = "condition") + # Add pairwise comparisons p-value
   stat_compare_means(label.y = 6)                   # Add global p-value

p2